AITopics | Estado de México

Estado de México

CAIRe: Cultural Attribution of Images by Retrieval-Augmented Evaluation

Yayavaram, Arnav, Yayavaram, Siddharth, Khanuja, Simran, Saxon, Michael, Neubig, Graham

arXiv.org Artificial Intelligence, Nov-21-2025

As text-to-image models become increasingly prevalent, ensuring their equitable performance across diverse cultural contexts is critical. Efforts to mitigate cross-cultural biases have been hampered by trade-offs, including a loss in performance, factual inaccuracies, or offensive outputs. Despite widespread recognition of these challenges, an inability to reliably measure these biases has stalled progress. To address this gap, we introduce CAIRe, an evaluation metric that assesses the degree of cultural relevance of an image, given a user-defined set of labels. Our framework grounds entities and concepts in the image to a knowledge base and uses factual information to give independent graded judgments for each culture label. On a manually curated dataset of culturally salient but rare items built using language models, CAIRe surpasses all baselines by 22% F1 points. Additionally, we construct two datasets for culturally universal concepts, one comprising T2I-generated outputs and another retrieved from naturally occurring data. CAIRe achieves Pearson's correlations of 0.56 and 0.66 with human ratings on these sets, based on a 5-point Likert scale of cultural relevance. This demonstrates its strong alignment with human judgment across diverse image sources.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2506.09109

Country:

Asia > India (0.05)
Asia > Indonesia (0.05)
Europe > Ukraine (0.05)
(20 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.94)
(3 more...)

Perceptions of AI Across Sectors: A Comparative Review of Public Attitudes

Bialy, Filip, Elliot, Mark, Meckin, Robert

arXiv.org Artificial Intelligence, Sep-24-2025

Even though the current generation of AI is underpinned by a common technology - namely machine learning, especially in the form of deep learning - in the public eye it has not emerged as a single solution. Rather, it has taken shape through multiple and overlapping applications - ranging from predictive diagnostics in healthcare and algorithmic hiring systems in HR to autonomous weapons and generative language models. As AI becomes increasingly embedded in sector-specific infrastructures, the question of how publics perceive its use is gaining urgency. Existing literature on public perception of AI suggests that attitudes are highly sensitive to the application domain. People tend to be more supportive of AI in domains where it is perceived to augment human capacity (e.g., in medical diagnostics) and more sceptical when AI is seen as replacing judgement or threatening civil liberties or rights (e.g., in security or surveillance). These perceptions are shaped not only by technical features of the AI system but also by institutional trust, cultural attitudes toward risk, and the moral economy of the domain in question. Despite this, few reviews have systematically compared public perceptions across sectors and explored the cross-domain patterns and differences in attitudes.

data mining, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2509.18233

Country:

Asia > Middle East > UAE (0.27)
Asia > Japan (0.05)
Europe > Germany (0.05)
(51 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (1.00)
Overview (1.00)

Industry:

Media > News (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Information Technology > Security & Privacy (1.00)
(15 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
(6 more...)

Can Artificial Intelligence Write Like Borges? An Evaluation Protocol for Spanish Microfiction

Manzanarez, Gerardo Aleman, Arana, Nora de la Cruz, Flores, Jorge Garcia, Medina, Yobany Garcia, Monroy, Raul, Pernelle, Nathalie

arXiv.org Artificial Intelligence, Jun-11-2025

Automated story writing has been a subject of study for over 60 years. Large language models can generate narratively consistent and linguistically coherent short fiction texts. Despite these advancements, rigorous assessment of such outputs for literary merit - especially concerning aesthetic qualities - has received scant attention. In this paper, we address the challenge of evaluating AI-generated microfictions and argue that this task requires consideration of literary criteria across various aspects of the text, such as thematic coherence, textual clarity, interpretive depth, and aesthetic quality. To facilitate this, we present GrAImes: an evaluation protocol grounded in literary theory, specifically drawing from a literary perspective, to offer an objective framework for assessing AI-generated microfiction. Furthermore, we report the results of our validation of the evaluation protocol, as answered by both literature experts and literary enthusiasts. This protocol will serve as a foundation for evaluating automatically generated microfictions and assessing their literary value.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2506.08172

Country:

Europe > France (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > Mexico > Estado de México (0.04)
Asia > South Korea > Seoul > Seoul (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Media (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Sentiment Analysis on the young people's perception about the mobile Internet costs in Senegal

Mbaye, Derguene, Seye, Madoune Robert, Diallo, Moussa, Ndiaye, Mamadou Lamine, Sow, Djiby, Adjanohoun, Dimitri Samuel, Mbengue, Tatiana, Wade, Cheikh Samba, De Roulet, Pablo, Munyaka, Jean-Claude Baraka, Chenal, Jerome

arXiv.org Artificial Intelligence, Apr-21-2025

Internet penetration rates in Africa are rising steadily, and mobile Internet is getting an even bigger boost with the availability of smartphones. Young people are increasingly using the Internet, especially social networks, and Senegal is no exception to this revolution. Social networks have become the main means of expression for young people. Despite this evolution in Internet access, there are few operators on the market, which limits the alternatives available in terms of value for money. In this paper, we will look at how young people feel about the price of mobile Internet in Senegal, in relation to the perceived quality of the service, through their comments on social networks. We scanned a set of Twitter and Facebook comments related to the subject and applied a sentiment analysis model to gather their general feelings.

artificial intelligence, natural language, social media, (17 more...)

arXiv.org Artificial Intelligence

2504.13284

Country:

Africa > Sudan (0.14)
Europe > Switzerland > Vaud > Lausanne (0.04)
Asia > Singapore > Central Region > Singapore (0.04)
(15 more...)

Genre: Research Report (0.83)

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)

Tec-Habilidad: Skill Classification for Bridging Education and Employment

Butt, Sabur, Ceballos, Hector G., Madera, Diana P.

arXiv.org Artificial Intelligence, Mar-5-2025

Job application and assessment processes have evolved significantly in recent years, largely due to advancements in technology and changes in the way companies operate. Skill extraction and classification remain an important component of the modern hiring process, as they provide a more objective way to evaluate candidates and automatically align their skills with the job requirements. However, to effectively evaluate the skills, the skill extraction tools must recognize varied mentions of skills on resumes, including direct mentions, implications, synonyms, acronyms, phrases, and proficiency levels, and differentiate between hard and soft skills. While tools like LLMs (Large Language Models) help extract and categorize skills from job applications, there's a lack of comprehensive datasets for evaluating the effectiveness of these models in accurately identifying and classifying skills in Spanish-language job applications. This gap hinders our ability to assess the reliability and precision of the models, which is crucial for ensuring that the selected candidates truly possess the required skills for the job. In this paper, we develop a Spanish language dataset for skill extraction and classification, provide annotation methodology to distinguish between knowledge, skill, and abilities, and provide deep learning baselines to advance robust solutions for skill classification.

classification, dataset, skill identification, (13 more...)

arXiv.org Artificial Intelligence

2503.03932

Country:

North America > United States (0.14)
South America (0.04)
North America > Mexico > Estado de México (0.04)
North America > Central America (0.04)

Genre: Research Report (1.00)

Industry:

Education > Curriculum (0.94)
Automobiles & Trucks > Manufacturer (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks

Jiang, Ziyan, Meng, Rui, Yang, Xinyi, Yavuz, Semih, Zhou, Yingbo, Chen, Wenhu

arXiv.org Artificial Intelligence, Jan-2-2025

Embedding models have been crucial in enabling various downstream tasks such as semantic similarity, information retrieval, and clustering. Recently, there has been a surge of interest in developing universal text embedding models that can generalize across tasks (e.g., MTEB). However, progress in learning universal multimodal embedding models has been relatively slow despite its importance and practicality. In this work, we aim to explore the potential of building universal multimodal embeddings capable of handling a wide range of downstream tasks. Our contributions are twofold: (1) we propose MMEB (Massive Multimodal Embedding Benchmark), which covers 4 meta-tasks. We show that VLMs are secretly strong embedding models. Embeddings, or distributed representations, encode inputs (whether text or images) as fixed-dimensional vectors, enabling a range of downstream tasks. A recent shift in research has focused on developing universal embeddings that can generalize across a wide range of tasks. For instance, Muennighoff et al. (2023) introduced MTEB (Massive Text Embedding Benchmark) to comprehensively assess text embeddings across tasks such as classification and clustering. MTEB has become the standard for evaluating universal text embeddings. Recent works (Wang et al., 2022a; Su et al., 2023; Wang et al., 2024; Springer et al., 2024; BehnamGhader et al., 2024) have demonstrated promising results on the MTEB benchmark. However, progress in multimodal embeddings has been comparatively slow. Work done during an internship at University of Waterloo in collaboration with Salesforce Research.

dataset, instruction, proceedings, (11 more...)

arXiv.org Artificial Intelligence

2410.0516

Country:

North America > United States > Texas > Hays County > San Marcos (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Germany > Baden-Württemberg (0.04)
(10 more...)

Genre: Research Report (0.64)

Industry:

Leisure & Entertainment > Sports > Tennis (0.68)
Information Technology (0.66)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Enhancing Multi-hop Reasoning through Knowledge Erasure in Large Language Model Editing

Zhang, Mengqi, Fang, Bowen, Liu, Qiang, Ren, Pengjie, Wu, Shu, Chen, Zhumin, Wang, Liang

arXiv.org Artificial Intelligence, Aug-22-2024

Large language models (LLMs) face challenges with internal knowledge inaccuracies and outdated information. Knowledge editing has emerged as a pivotal approach to mitigate these issues. Although current knowledge editing techniques exhibit promising performance in single-hop reasoning tasks, they show limitations when applied to multi-hop reasoning. Drawing on cognitive neuroscience and the operational mechanisms of LLMs, we hypothesize that the residual single-hop knowledge after editing causes edited models to revert to their original answers when processing multi-hop questions, thereby undermining their performance in multi-hop reasoning tasks. To validate this hypothesis, we conduct a series of experiments that empirically confirm our assumptions. Building on the validated hypothesis, we propose a novel knowledge editing method that incorporates a Knowledge Erasure mechanism for Large language model Editing (KELE). Specifically, we design an erasure function for residual knowledge and an injection function for new knowledge. Through joint optimization, we derive the optimal recall vector, which is subsequently utilized within a rank-one editing framework to update the parameters of targeted model layers. Extensive experiments on GPT-J and GPT-2 XL demonstrate that KELE substantially enhances the multi-hop reasoning capability of edited LLMs.

knowledge, llm, multi-hop question, (17 more...)

arXiv.org Artificial Intelligence

2408.12456

Country:

North America > United States (0.95)
Asia > China > Hong Kong (0.05)
Africa (0.05)
(7 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.54)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)

A quantitative and typological study of Early Slavic participle clauses and their competition

Pedrazzini, Nilo

arXiv.org Artificial Intelligence, May-8-2024

This thesis is a corpus-based, quantitative, and typological analysis of the functions of Early Slavic participle constructions and their finite competitors ($jegda$-'when'-clauses). The first part leverages detailed linguistic annotation on Early Slavic corpora at the morphosyntactic, dependency, information-structural, and lexical levels to obtain indirect evidence for different potential functions of participle clauses and their main finite competitor and understand the roles of compositionality and default discourse reasoning as explanations for the distribution of participle constructions and $jegda$-clauses in the corpus. The second part uses massively parallel data to analyze typological variation in how languages express the semantic space of English $when$, whose scope encompasses that of Early Slavic participle constructions and $jegda$-clauses. Probabilistic semantic maps are generated and statistical methods (including Kriging, Gaussian Mixture Modelling, precision and recall analysis) are used to induce cross-linguistically salient dimensions from the parallel corpus and to study conceptual variation within the semantic space of the hypothetical concept WHEN.

compositionality and default discourse reasoning, jegda-clause and temporal relation interpretation, predictable participle lemma-subject lemma combination, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.5287/ora-8gv0b4qyo

2405.01972

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.27)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.13)
Europe > Ukraine > Kyiv Oblast > Kyiv (0.13)
(75 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry:

Media (0.92)
Leisure & Entertainment (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
(3 more...)

SeeGULL Multilingual: a Dataset of Geo-Culturally Situated Stereotypes

Bhutani, Mukul, Robinson, Kevin, Prabhakaran, Vinodkumar, Dave, Shachi, Dev, Sunipa

arXiv.org Artificial Intelligence, Mar-8-2024

While generative multilingual models are rapidly being deployed, their safety and fairness evaluations are largely limited to resources collected in English. This is especially problematic for evaluations targeting inherently socio-cultural phenomena such as stereotyping, where it is important to build multilingual resources that reflect the stereotypes prevalent in respective language communities. However, gathering these resources, at scale, in varied languages and regions poses a significant challenge as it requires broad socio-cultural knowledge and can also be prohibitively expensive. To overcome this critical gap, we employ a recently introduced approach that couples LLM generations for scale with culturally situated validations for reliability, and build SeeGULL Multilingual, a global-scale multilingual dataset of social stereotypes, containing over 25K stereotypes, spanning 20 languages, with human annotations across 23 regions, and demonstrate its utility in identifying gaps in model evaluations. Content warning: Stereotypes shared in this paper can be offensive.

dataset, stereotype, wikipedia, (14 more...)

arXiv.org Artificial Intelligence

2403.05696

Country:

Asia > India (0.06)
Europe > Spain (0.05)
Europe > Portugal (0.05)
(59 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)

GlotLID: Language Identification for Low-Resource Languages

Kargaran, Amir Hossein, Imani, Ayyoob, Yvon, François, Schütze, Hinrich

arXiv.org Artificial Intelligence, Nov-4-2023

Several recent papers have published good solutions for language identification (LID) for about 300 high-resource and medium-resource languages. However, there is no LID available that (i) covers a wide range of low-resource languages, (ii) is rigorously evaluated and reliable, and (iii) is efficient and easy to use. Here, we publish GlotLID-M, an LID model that satisfies the desiderata of wide coverage, reliability and efficiency. It identifies 1665 languages, a large increase in coverage compared to prior work. In our experiments, GlotLID-M outperforms four baselines (CLD3, FT176, OpenLID and NLLB) when balancing F1 and false positive rate (FPR). We analyze the unique challenges that low-resource LID poses: incorrect corpus metadata, leakage from high-resource languages, difficulty separating closely related languages, handling of macrolanguage vs varieties and, in general, noisy data. We hope that integrating GlotLID-M into dataset creation pipelines will improve quality and enhance accessibility of NLP technology for low-resource languages and cultures. GlotLID-M model, code, and list of data sources are available: https://github.com/cisnlp/GlotLID.

language identification, natural language processing, resource and evaluation conference, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.18653/v1/2023.findings-emnlp.410

2310.16248

Country:

Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
South America > Peru > Huánuco Department > Huánuco Province > Huánuco (0.04)
North America > Mexico > Puebla (0.04)
(84 more...)

Genre: Research Report > New Finding (0.87)

Industry:

Media > Television (0.45)
Health & Medicine > Therapeutic Area > Neurology (0.33)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
